Efficient Incremental Mining of Top-K Frequent Closed Itemsets

نویسندگان

  • Andrea Pietracaprina
  • Fabio Vandin
چکیده

In this work we study the mining of top-K frequent closed itemsets, a recently proposed variant of the classical problem of mining frequent closed itemsets where the support threshold is chosen as the maximum value sufficient to guarantee that the itemsets returned in output be at least K. We discuss the effectiveness of parameter K in controlling the output size and develop an efficient algorithm for mining top-K frequent closed itemsets in order of decreasing support, which exhibits consistently better performance than the best previously known one, attaining substantial improvements in some cases. A distinctive feature of our algorithm is that it allows the user to dynamically raise the value K with no need to restart the computation from scratch.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Top-k-FCI: Mining Top-K Frequent Closed Itemsets in Data Streams

With the generation and analysis of stream data, such as network monitoring in real time, log records, click streams, a great deal of attention has been concerned on data streams mining in the field of data mining. In the process of the data streams mining, it is more reasonable to ask users to set a bound on the result size. Therefore, in this paper, an real-time single-pass algorithm, called ...

متن کامل

An Efficient Incremental Algorithm to Mine Closed Frequent Itemsets over Data Streams

The purpose of this work is to mine closed frequent itemsets from transactional data streams using a sliding window model. An efficient algorithm IMCFI is proposed for Incremental Mining of Closed Frequent Itemsets from a transactional data stream. The proposed algorithm IMCFI uses a data structure called INdexed Tree(INT) similar to NewCET used in NewMoment[5]. INT contains an index table Item...

متن کامل

Incremental updates of closed frequent itemsets over continuous data streams

Online mining of closed frequent itemsets over streaming data is one of the most important issues in mining data streams. In this paper, we propose an efficient one-pass algorithm, NewMoment to maintain the set of closed frequent itemsets in data streams with a transaction-sensitive sliding window. An effective bit-sequence representation of items is used in the proposed algorithm to reduce the...

متن کامل

Mining Top-k Frequent Closed Itemsets in Data Streams Using Sliding Window

Frequent itemset mining has become a popular research area in data mining community since the last few years. There are two main technical hitches while finding frequent itemsets. First, to provide an appropriate minimum support value to start and user need to tune this minimum support value by running the algorithm again and again. Secondly, generated frequent itemsets are mostly numerous and ...

متن کامل

A comprehensive method for discovering the maximal frequent set

The association rule mining can be divided into two steps.The first step is to find out all frequent itemsets, whose occurrences are greater than or equal to the user-specified threshold.The second step is to generate reliable association rules based on all frequent itemsets found in the first step. Identifying all frequent itemsets in a large database dominates the overall performance in the a...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007